    Degree-degree correlations in random graphs with heavy-tailed degrees

    Mixing patterns in large self-organizing networks, such as the Internet, the World Wide Web, and social and biological networks, are often characterized by degree-degree dependencies between neighbouring nodes. One of the problems with the commonly used Pearson's correlation coefficient (termed the assortativity coefficient) is that in disassortative networks its magnitude decreases with the network size. This makes it impossible to compare mixing patterns, for example, in two web crawls of different size. We start with a simple model of two heavy-tailed, highly correlated random variables $X$ and $Y$, and show that the sample correlation coefficient converges in distribution either to a proper random variable on $[-1,1]$, or to zero, and if $X,Y\ge 0$ then the limit is non-negative. We next show that it is non-negative in the large graph limit when the degree distribution has an infinite third moment. We consider the alternative degree-degree dependency measure, based on Spearman's rho, and prove that it converges to an appropriate limit under very general conditions. We verify that these conditions hold in common network models, such as the configuration model and the Preferential Attachment model. We conclude that rank correlations provide a suitable and informative method for uncovering network mixing patterns
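
    As a rough illustration of the two measures contrasted in this abstract, the sketch below computes Pearson's assortativity coefficient and a Spearman-style rank correlation over the degree pairs at the two ends of each edge. The function names, the edge-list input format, and the use of average ranks for ties are assumptions made for this example; the paper may resolve ties differently. Self-loops are assumed absent.

```python
from math import sqrt


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # undefined for constant input; 0.0 is a convention chosen here
    return cov / sqrt(vx * vy)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank (an assumption)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def degree_pairs(edges):
    """Degrees at both ends of every undirected edge, counted in both directions."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    return xs, ys


def assortativity(edges):
    """Pearson correlation of degrees across edges."""
    return pearson(*degree_pairs(edges))


def spearman_rho(edges):
    """Pearson correlation of the degree ranks across edges."""
    xs, ys = degree_pairs(edges)
    return pearson(average_ranks(xs), average_ranks(ys))


# A star graph: every edge joins the hub to a leaf, so it is fully disassortative.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(assortativity(star), spearman_rho(star))
```

    On a small star graph both measures agree; the abstract's point concerns how the Pearson version behaves as heavy-tailed graphs grow, which a toy example of this size cannot show.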

    Minimal chordal sense of direction and circulant graphs

    A sense of direction is an edge labeling on graphs that follows a globally consistent scheme and is known to considerably reduce the complexity of several distributed problems. In this paper, we study a particular instance of sense of direction, called a chordal sense of direction (CSD). In particular, we identify the class of k-regular graphs that admit a CSD with exactly k labels (a minimal CSD). We prove that connected graphs in this class are Hamiltonian and that the class is equivalent to that of circulant graphs, presenting an efficient (polynomial-time) way of recognizing it when the graphs' degree k is fixed

    Node Labels in Local Decision

    The role of unique node identifiers in network computing is well understood as far as symmetry breaking is concerned. However, the unique identifiers also leak information about the computing environment - in particular, they provide some nodes with information related to the size of the network. It was recently proved that in the context of local decision, there are some decision problems such that (1) they cannot be solved without unique identifiers, and (2) unique node identifiers leak a sufficient amount of information such that the problem becomes solvable (PODC 2013). In this work we study the minimal amount of information that we need to leak from the environment to the nodes in order to solve local decision problems. Our key results are related to scalar oracles $f$ that, for any given $n$, provide a multiset $f(n)$ of $n$ labels; then the adversary assigns the labels to the $n$ nodes in the network. This is a direct generalisation of the usual assumption of unique node identifiers. We give a complete characterisation of the weakest oracle that leaks at least as much information as the unique identifiers. Our main result is the following dichotomy: we classify scalar oracles as large and small, depending on their asymptotic behaviour, and show that (1) any large oracle is at least as powerful as the unique identifiers in the context of local decision problems, while (2) for any small oracle there are local decision problems that still benefit from unique identifiers. Comment: Conference version to appear in the proceedings of SIROCCO 201

    Fast Quasi-Threshold Editing

    We introduce Quasi-Threshold Mover (QTM), an algorithm to solve the quasi-threshold (also called trivially perfect) graph editing problem with edge insertion and deletion. Given a graph, it computes a quasi-threshold graph which is close in terms of edit count. This edit problem is NP-hard. We present an extensive experimental study, in which we show that QTM is the first algorithm that is able to scale to large real-world graphs in practice. As a side result we further present a simple linear-time algorithm for the quasi-threshold recognition problem. Comment: 26 pages, 4 figures, submitted to ESA 201

    LiveRank: How to Refresh Old Crawls

    This paper considers the problem of refreshing a crawl. More precisely, given a collection of Web pages (with hyperlinks) gathered at some time, we want to identify a significant fraction of these pages that still exist at the present time. The liveness of an old page can be tested through an online query at the present time. We call LiveRank a ranking of the old pages so that active nodes are more likely to appear first. The quality of a LiveRank is measured by the number of queries necessary to identify a given fraction of the alive pages when using the LiveRank order. We study different scenarios, from a static setting where the LiveRank is computed before any query is made, to dynamic settings where the LiveRank can be updated as queries are processed. Our results show that building on the PageRank can lead to efficient LiveRanks for Web graphs

    The Number of Convex Permutominoes

    Permutominoes are polyominoes defined by suitable pairs of permutations. In this paper we provide a formula to count the number of convex permutominoes of given perimeter. To this aim we define the transform of a generic pair of permutations, we characterize the transform of any pair defining a convex permutomino, and we solve the counting problem in the transformed space

    Spectral centrality measures in complex networks

    Complex networks are characterized by heterogeneous distributions of the degree of nodes, which produce a large diversification of the roles of the nodes within the network. Several centrality measures have been introduced to rank nodes based on their topological importance within a graph. Here we review and compare centrality measures based on spectral properties of graph matrices. We shall focus on PageRank, eigenvector centrality and the hub/authority scores of HITS. We derive simple relations between the measures and the (in)degree of the nodes, in some limits. We also compare the rankings obtained with different centrality measures. Comment: 11 pages, 10 figures, 5 tables. Final version published in Physical Review

    Ranking and clustering of nodes in networks with smart teleportation

    Random teleportation is a necessary evil for ranking and clustering directed networks based on random walks. Teleportation enables ergodic solutions, but the solutions must necessarily depend on the exact implementation and parametrization of the teleportation. For example, in the commonly used PageRank algorithm, the teleportation rate must trade off a heavily biased solution with a uniform solution. Here we show that teleportation to links rather than nodes enables a much smoother trade-off and effectively more robust results. We also show that, by not recording the teleportation steps of the random walker, we can further reduce the effect of teleportation with dramatic effects on clustering. Comment: 10 pages, 7 figure
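
    To make the teleportation trade-off mentioned in this abstract concrete, here is a minimal sketch of standard PageRank with uniform teleportation to nodes. It is not the link-teleportation scheme the paper proposes. The function name `pagerank`, the dictionary input format, the default rate of 0.15, and the handling of dangling nodes are assumptions made for this example.

```python
def pagerank(out_links, teleport=0.15, iters=100):
    """Power iteration for PageRank with uniform teleportation to nodes.

    out_links maps each node to the list of nodes it links to; every link
    target is assumed to appear as a key. teleport is the probability of
    jumping to a uniformly random node at each step.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        # Every node first receives its share of the teleportation mass.
        new = {v: teleport / n for v in nodes}
        for v in nodes:
            targets = out_links[v]
            if targets:
                share = (1.0 - teleport) * rank[v] / len(targets)
                for w in targets:
                    new[w] += share
            else:
                # Dangling node: spread its mass uniformly (an assumption).
                share = (1.0 - teleport) * rank[v] / n
                for w in nodes:
                    new[w] += share
        rank = new
    return rank


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
biased = pagerank(graph, teleport=0.15)   # follows the link structure closely
uniform = pagerank(graph, teleport=1.0)   # pure teleportation: uniform ranks
print(biased, uniform)
```

    With `teleport=1.0` the link structure is ignored entirely and every node gets the same score, while small rates let the link structure dominate; this is the tension the abstract describes.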

    Efficiently Clustering Very Large Attributed Graphs

    Attributed graphs model real networks by enriching their nodes with attributes accounting for properties. Several techniques have been proposed for partitioning these graphs into clusters that are homogeneous with respect to both semantic attributes and the structure of the graph. However, time and space complexities of state-of-the-art algorithms limit their scalability to medium-sized graphs. We propose SToC (for Semantic-Topological Clustering), a fast and scalable algorithm for partitioning large attributed graphs. The approach is robust, being compatible both with categorical and with quantitative attributes, and it is tailorable, allowing the user to weight the semantic and topological components. Further, the approach does not require the user to guess in advance the number of clusters. SToC relies on well-known approximation techniques such as bottom-k sketches, traditional graph-theoretic concepts, and a new perspective on the composition of heterogeneous distance measures. Experimental results demonstrate its ability to efficiently compute high-quality partitions of large-scale attributed graphs. Comment: This work has been published in ASONAM 2017. This version includes an appendix with validation of our attribute model and distance function, omitted in the conference version for lack of space. Please refer to the published versio
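
    The abstract names bottom-k sketches as one of the approximation techniques SToC relies on. The sketch below shows the generic technique only: estimating the Jaccard similarity of two sets from their k smallest hash values. How SToC actually applies the sketches is not stated here, and the helper names, the SHA-1 based hash, and the choice k = 64 are all assumptions made for this example.

```python
import hashlib


def stable_hash(item):
    """Map an item to a 64-bit integer via SHA-1 (an arbitrary choice here)."""
    digest = hashlib.sha1(str(item).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def bottom_k(items, k):
    """Bottom-k sketch: the k smallest distinct hash values of the set."""
    return sorted({stable_hash(x) for x in items})[:k]


def jaccard_estimate(sketch_a, sketch_b, k):
    """Estimate |A ∩ B| / |A ∪ B| from two bottom-k sketches.

    The k smallest hashes of the merged sketches form a bottom-k sketch of
    the union; the fraction of them present in both inputs is the estimate.
    """
    a, b = set(sketch_a), set(sketch_b)
    union = sorted(a | b)[:k]
    if not union:
        return 0.0
    return sum(1 for value in union if value in a and value in b) / len(union)


k = 64
first = bottom_k(range(0, 1000), k)
second = bottom_k(range(500, 1500), k)   # true Jaccard similarity is 1/3
print(jaccard_estimate(first, second, k))
```

    Each sketch has fixed size k regardless of the set size, which is what makes this kind of estimate attractive for very large graphs.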

    Estimating latent feature-feature interactions in large feature-rich graphs

    Real-world complex networks describe connections between objects; in reality, those objects are often endowed with some kind of features. How does the presence or absence of such features interplay with the network link structure? Although the situation described here is truly ubiquitous, there is a limited body of research dealing with large graphs of this kind. Many previous works considered homophily as the only possible transmission mechanism translating node features into links. Other authors, instead, developed more sophisticated models that are able to handle complex feature interactions, but are unfit to scale to very large networks. We expand on the MGJ model, where interactions between pairs of features can foster or discourage link formation. In this work, we will investigate how to estimate the latent feature-feature interactions in this model. We shall propose two solutions: the first one assumes feature independence and is essentially based on Naive Bayes; the second one, which relaxes the independence assumption, is based on perceptrons. In fact, we show it is possible to cast the model equation in order to see it as the prediction rule of a perceptron. We analyze how classical results for perceptrons can be interpreted in this context; then, we define a fast and simple perceptron-like algorithm for this task, which can process $10^8$ links in minutes. We then compare these two techniques, first with synthetic datasets that follow our model, gaining evidence that the Naive independence assumptions are detrimental in practice. Secondly, we consider a real, large-scale citation network where each node (i.e., paper) can be described by different types of characteristics; there, our algorithm can assess how well each set of features can explain the links, and thus find meaningful latent feature-feature interactions